Outline

Part 1: Drawing a CONSORT flow diagram

Part 2: Data exploration

Part 1: Drawing a CONSORT flow diagram

CONSORT flow diagram

Does treatment of COVID-19 with Paxlovid reduce hospitalization and death?

Biden has begun taking Paxlovid, an antiviral medication shown to prevent serious Covid cases

 

Hammond J, Leister-Tebbe H, Gardner A, et al. Oral Nirmatrelvir for High-Risk, Nonhospitalized Adults with Covid-19. N Engl J Med. 2022;386(15):1397-1408.


A data frame for the Paxlovid CONSORT diagram

dim(p)
## [1] 2383   10
head(p)
##    rand  exc1  exc2  oth1    group  died  dis1  dis2  dis3    fu
## 1 FALSE FALSE  TRUE FALSE     <NA>    NA    NA    NA    NA FALSE
## 2  TRUE FALSE FALSE FALSE  placebo FALSE FALSE FALSE FALSE  TRUE
## 3  TRUE FALSE FALSE FALSE paxlovid FALSE FALSE FALSE  TRUE FALSE
## 4  TRUE FALSE FALSE FALSE paxlovid FALSE FALSE FALSE FALSE  TRUE
## 5  TRUE FALSE FALSE FALSE  placebo  TRUE FALSE FALSE FALSE FALSE
## 6 FALSE  TRUE FALSE FALSE     <NA>    NA    NA    NA    NA FALSE
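To make the layout concrete, here is a minimal sketch that rebuilds just the six rows printed above as a toy data frame. The name toy is hypothetical, and the two consistency checks reflect how the columns appear to be coded (an assumption read off the printed rows, not a documented rule): unrandomized patients have no group, and every unrandomized patient has an exclusion reason.

```r
# Toy copy of the six rows shown by head(p); not the full trial data.
toy <- data.frame(
  rand  = c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE),
  exc1  = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE),
  exc2  = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
  oth1  = FALSE,  # recycled to all six rows
  group = c(NA, "placebo", "paxlovid", "paxlovid", "placebo", NA),
  died  = c(NA, FALSE, FALSE, FALSE, TRUE, NA),
  dis1  = c(NA, FALSE, FALSE, FALSE, FALSE, NA),
  dis2  = c(NA, FALSE, FALSE, FALSE, FALSE, NA),
  dis3  = c(NA, FALSE, TRUE, FALSE, FALSE, NA),
  fu    = c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)
)
dim(toy)

# Assumed coding rule 1: group is missing exactly when rand is FALSE.
group_ok <- with(toy, all(is.na(group) == !rand))

# Assumed coding rule 2: each patient is randomized or has an exclusion reason.
reason_ok <- with(toy, all(rand | exc1 | exc2 | oth1))
```

One row per patient assessed for eligibility, with logical flags for each reason, is what lets a single data frame drive every box in the diagram.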

A single-layer tree

table(p$rand)
## 
## FALSE  TRUE 
##   137  2246
library(vtree)
vtree(p, "rand", horiz=FALSE)

A two-layer tree

vtree(p, "rand group", horiz=FALSE)

Pruning a node

vtree(p, "rand group", follow=list(rand=TRUE), horiz=FALSE)
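The counts behind the pruned tree can be sketched in base R. This uses a hypothetical six-row toy data frame (not the trial data) and plain table() calls rather than vtree itself; the point is that following only rand=TRUE amounts to tabulating group within the randomized subset.

```r
# Hypothetical toy data: two excluded and four randomized patients.
toy <- data.frame(
  rand  = c(FALSE, TRUE, TRUE, TRUE, TRUE, FALSE),
  group = c(NA, "placebo", "paxlovid", "paxlovid", "placebo", NA)
)

# First layer: excluded vs randomized.
root_split <- table(toy$rand)

# Second layer without pruning: the excluded row has only missing groups.
arms_all <- table(toy$rand, toy$group, useNA = "ifany")

# Second layer with pruning: group counts among randomized patients only.
arms_followed <- table(toy$group[toy$rand])
arms_followed
```

Pruning the excluded branch avoids a child node that would contain nothing but missing values.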

A three-layer tree

vtree(p, "rand group fu", follow=list(rand=TRUE), horiz=FALSE)

Adding labels

vtree(p, "rand group fu", follow=list(rand=TRUE), horiz=FALSE, showvarnames=FALSE, title="Assessed for eligibility",
  labelnode=list(rand=c(Excluded=FALSE, Randomized=TRUE), fu=c("Discontinued trial"=FALSE, "Followed up"=TRUE)))

Summaries in vtree

The summary parameter displays information about other variables within each node (a subset of the data). It has the form:

summary="variable_name format_string"

where format_string consists of text and special codes.

 

For example, how many patients were excluded because they didn’t meet eligibility criteria (exc1)?

summary="exc1 Did not meet eligibility criteria: %sum%"

In the CONSORT diagram this generates the output:

        Did not meet eligibility criteria: 124
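As a rough base-R sketch of what %sum% reports here: within the excluded node, it is the sum of the logical variable exc1, that is, the number of TRUE values. The toy data frame and the name excluded are hypothetical, and the count of 2 belongs to the toy data, not the trial.

```r
# Hypothetical toy data: three excluded patients, two randomized.
toy <- data.frame(
  rand = c(FALSE, FALSE, FALSE, TRUE, TRUE),
  exc1 = c(TRUE, TRUE, FALSE, FALSE, FALSE)
)

# Restrict to the node of interest (rand is FALSE), then sum the flag.
excluded <- toy[!toy$rand, ]
label <- sprintf("Did not meet eligibility criteria: %d", sum(excluded$exc1))
label
```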

A detailed CONSORT diagram

vtree(p, "rand group fu", follow=list(rand=TRUE), horiz=FALSE, showvarnames=FALSE, title="Assessed for eligibility", 
  labelnode=list(rand=c(Excluded=FALSE, Randomized=TRUE), fu=c("Discontinued trial"=FALSE, "Followed up"=TRUE)),
  summary=c("exc1 \nDid not meet eligibility criteria: %sum%%var=rand%%node=FALSE%",
    "exc2 \nWithdrew: %sum%%var=rand%%node=FALSE%", "oth1 \nHad other reason: %sum%%var=rand%%node=FALSE%",
    "died \nDied: %sum%%var=fu%%node=FALSE%", "dis1 \nLost to follow-up: %sum%%var=fu%%node=FALSE%",
    "dis2 \nWithdrew: %sum%%var=fu%%node=FALSE%", "dis3 \nHad other reason: %sum%%var=fu%%node=FALSE%"),
  splitwidth=Inf, fillcolor="white", rootfillcolor="white", showpct=FALSE)

Part 2: Data exploration

Spreadsheet magnifier

A retrospective cohort study

Here’s a dataset from the medicaldata package, as reported in this paper:

Sobecks et al. “Cytomegalovirus Reactivation After Matched Sibling Donor Reduced-Intensity Conditioning Allogeneic Hematopoietic Stem Cell Transplant Correlates With Donor Killer Immunoglobulin-like Receptor Genotype”. Exp Clin Transplant 2011; 1: 7-13.

The study concerns the risk of cytomegalovirus (CMV) reactivation in patients treated with a bone marrow stem cell transplant.

library(medicaldata)
dim(cytomegalovirus)
## [1] 64 26

Frequencies

vtree(cytomegalovirus, "diagnosis", pattern=TRUE)

For convenience, I’ll assign some interpretable factor levels

library(forcats)
library(dplyr)
stemcell <- cytomegalovirus %>%
  mutate(
    CMV              = fct_recode(as.character(cmv), "CMV reactivation"="1", "No CMV reactivation"="0"),
    sex              = fct_recode(as.character(sex), Male="1", Female="0"),
    race             = fct_recode(as.character(race), White="1", "African American"="0"),
    diagnosis.type   = fct_recode(as.character(diagnosis.type), Myeloid="1", Lymphoid="0"),
    prior.transplant = fct_recode(as.character(prior.transplant), "No prior transplant"="0", "Prior transplant"="1"),
    prior.radiation  = fct_recode(as.character(prior.radiation), "No radiation"="0", "Radiation"="1"),
    c1c2             = fct_recode(as.character(`C1/C2`), Heterozygous="0", Homozygous="1"))  
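For readers without forcats, a similar relabeling can be sketched with base R's factor(). This is a swapped-in technique, not the code used above, and the vector raw is hypothetical. One difference worth knowing: a code outside the declared levels silently becomes NA.

```r
# Hypothetical 0/1 codes, as in the sex column.
raw <- c(1, 0, 0, 1)

# Map 0 -> "Female" and 1 -> "Male"; levels are listed in code order.
sex <- factor(raw, levels = c(0, 1), labels = c("Female", "Male"))
levels(sex)
table(sex)

# A code that is not among the declared levels becomes NA.
unexpected <- factor(2, levels = c(0, 1), labels = c("Female", "Male"))
is.na(unexpected)
```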

Patterns

vtree(stemcell, "sex c1c2 prior.radiation", pattern=TRUE, showvarnames=FALSE)
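The idea behind a pattern display is to count how often each combination of values occurs. A minimal base-R sketch on hypothetical toy data (five made-up patients, two variables) follows; it uses aggregate() rather than vtree, and the column n is introduced only for counting.

```r
# Hypothetical toy data: five patients, two categorical variables.
toy <- data.frame(
  sex  = c("Male", "Male", "Female", "Male", "Female"),
  c1c2 = c("Heterozygous", "Heterozygous", "Homozygous",
           "Homozygous", "Homozygous")
)

# Add a column of ones and sum it within each observed combination.
patterns <- aggregate(n ~ sex + c1c2, data = cbind(toy, n = 1), FUN = sum)

# Most frequent patterns first.
patterns <- patterns[order(-patterns$n), ]
patterns
```

Only combinations that actually occur appear in the result, which is what makes patterns a compact alternative to a full cross-tabulation.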

CMV reactivation (R) post-transplant

vtree(stemcell, "sex c1c2 prior.radiation", sameline=TRUE, showvarnames=FALSE, summary="cmv   \nR %npct%", splitwidth=Inf)

Quantitative variables

Although vtree focuses on discrete variables, it can display summaries of quantitative variables

The primary risk factor of interest was the number of activating killer immunoglobulin-like receptors (aKIRs: 1-4 vs. 5-6).

vtree(stemcell, "cmv", summary="aKIRs \naKIRs mean (SD): %mean% (%SD%)",
  labelnode=list(cmv=c("No"=0,"Yes"=1)), labelvar=c(cmv="CMV reactivation"))
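The per-node numbers behind %mean% and %SD% can be approximated in base R with tapply(). The toy values below are hypothetical, chosen only so the arithmetic is easy to check, and say nothing about the real data.

```r
# Hypothetical toy data: three patients per CMV group.
toy <- data.frame(
  cmv   = c(0, 0, 0, 1, 1, 1),
  aKIRs = c(1, 2, 3, 4, 5, 6)
)

# Mean and standard deviation of aKIRs within each cmv group.
m <- tapply(toy$aKIRs, toy$cmv, mean)
s <- tapply(toy$aKIRs, toy$cmv, sd)

# Format as "mean (SD)", one string per group.
labels <- sprintf("aKIRs mean (SD): %.1f (%.1f)", m, s)
labels
```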

Dichotomizing a quantitative variable

vtree(stemcell, "aKIRs>=5", summary="cmv \nR: %npct%")
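Here is a base-R sketch of the same split, assuming cmv is coded 0/1 as in the dataset. The toy values are hypothetical and do not reflect the study's findings; the name high is introduced only for illustration.

```r
# Hypothetical toy data: aKIR counts and a 0/1 CMV reactivation flag.
toy <- data.frame(
  aKIRs = c(1, 2, 4, 5, 5, 6),
  cmv   = c(1, 0, 1, 0, 0, 1)
)

# Dichotomize: TRUE when the patient has five or more aKIRs.
high <- toy$aKIRs >= 5

# Number and percentage with reactivation in each half.
n   <- tapply(toy$cmv, high, sum)
pct <- round(100 * tapply(toy$cmv, high, mean))
data.frame(high = names(n), n = as.numeric(n), pct = as.numeric(pct))
```

Writing the condition directly in the variable list, as in "aKIRs>=5", saves creating a separate dichotomized column first.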

Thanks!

For more information

vtree is available on CRAN
See also the vtree webpage: https://nbarrowman.github.io/vtree

Acknowledgements

Richard Webster
Sebastian Gatscha

Questions?